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Abstract 

We  present  a  general  protocol  for  detecting  whether  a  property 
holds  in  a  distributed  system,  where  the  property  is  a  member  of  a 
subclass  of  stable  properties  we  call  the  locally  stable  properties.  Our 
protocol  is  based  on  a  decentralized  method  for  constructing  a  maximal 
subset  of  the  local  states  that  are  mutually  consistent,  which  in  turn  is 
based  on  a  weakened  version  of  vector  time  stamps.  The  structure  of 
our  protocol  lends  itself  to  refinement,  and  we  demonstrate  its  utility 
by  deriving  some  specialized  property-detection  protocols,  including 
two  previously-known  protocols  that  are  known  to  be  efficient. 


1  Introduction 

It  is  conceptually  simple  to  determine  whether  the  global  state  of  a  dis¬ 
tributed  system  satisfies  a  stable  property:  that  is.  a  property  4>  that  satis¬ 
fies  $  =>  □$.  One  can  have  a  process  use  a  snapshot  algorithm  such  as  the 
one  given  in  [CL85]  to  collect  the  relevant  local  and  channel  states  and  then 
test  to  see  if  the  condition  holds  over  the  collected  state.  This  technique  can 

'This  work  was  supported  by  the  Defense  Advanced  Research  Projects  Agency  (DoDt 
under  NASA  Ames  grant  number  NAG  2-593,  and  by  grants  from  IBM  and  Siemens.  The 
views,  opinions,  and  findings  contained  in  this  report  are  those  of  the  authors  and  should 
not  be  construed  as  an  official  Department  of  Defense  position,  policy,  or  decision 
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be  used  to  detect  any  stable  property.  However,  for  many  stable  properties 
of  interest,  such  as  deadlock,  termination,  and  lack  of  a  token,  there  exist 
specialized  protocols  (for  example,  [Mis83,Mat87,BT84,CMH83])  that  are 
more  efficient  than  a  straightforward  application  of  [CL85].  As  well  as  being 
more  efficient,  many  of  these  specialized  protocols  are  very  elegant  and  their 
relation  to  snapshots  is  not  apparent. 

It  would  be  useful  if  one  could  derive  such  special-purpose  protocols 
by  refinement  of  a  general  snapshot  protocol.  Unfortunately,  the  protocol 
of  [CL85]  was  not  developed  with  refinement  in  mind,  and  we  have  not 
found  it  conducive  to  such  refinement.  In  this  paper,  we  present  a  different 
protocol  for  detecting  stable  properties  that  has  proven  to  be  more  conducive 
to  refinement. 

A  naive  general  detection  protocol  is  as  follows:  every  time  a  process 
executes  an  event,  it  appends  its  current  state  to  a  queue  maintained  in 
local  memory.  A  separate  process  po  periodically  retrieves  these  queues  of 
local  histories  and  extracts  from  them  the  latest  global  state.  Process  p0  then 
tests  to  see  if  the  property  holds  in  this  global  state.  Unfortunately,  this 
protocol  is  impractical  since  it  has  a  large  execution  overhead  and  requires 
unbounded  local  memory.  This  can  be  fixed  (at  a  cost  of  generality)  by 
having  each  process  record  only  its  current  state  at  appropriate  times  and 
by  having  po  consider  some  subset  of  these  local  states  that  could  be  part 
of  a  sensible  global  state  of  the  system.  Not  all  stable  properties  can  be 
detected  this  way,  but  it  turns  out  that  most  stable  properties  that  have 
been  discussed  in  the  literature  can. 

In  this  paper,  we  present  a  method  to  detect  a  subclass  of  stable  prop¬ 
erties.  The  method  can  be  easily  expressed  as  a  decentralized  protocol  and 
can  be  customized  for  different  properties  in  order  to  yield  efficient  special- 
purpose  protocols.  We  demonstrate  its  utility  by  using  it  to  derive  such 
protocols  including  two  previously-known  protocols  that  are  known  to  be 
efficient. 
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2  Definitions 


2.1  System  Model 


We  consider  an  asynchronous  distributed  system  consisting  of  a  set  of  n 
nonfaulty  processes  P  =  {pi,  ps, . . .  ,  pn}.  Between  any  two  processes  p,  and 
p:  there  exist  two  unidirectional  nonfaulty  FIFO  channels:  CU]  from  p,  to 
and  ChX  from  pj  to  p,.  These  channels  have  unbounded  delivery  time,  and 
processes  communicate  only  by  sending  and  receiving  messages  over  these 
channels. 

Processes  execute  events,  which  are  partitioned  into  send  events,  receive 
events,  and  local  events.  We  will  denote  the  Zth  event  executed  by  process 
p,  as  e\  and  the  resulting  local  state  o\.  Thus,  the  execution  of  process  p, 
can  be  denoted  (of  e j  of  e f  of  ■  •  •).  Note  that  the  state  cr-  reflects  the 
execution  of  events  ef  through  e-.  When  the  ordinality  of  an  event  or  state 
is  not  important,  we  will  drop  the  superscript,  e.g.  the  execution  of  event 
e,  results  in  the  local  state  <r,. 

An  arbitrary  collection  of  local  states  may  not  constitute  a  sensible  global 
state:  the  local  state  of  a  process  in  the  collection  may  reflect  the  receipt  of 
a  message  while  no  process’  local  state  reflects  the  sending  of  that  message. 
Such  sets  of  local  states  are  called  inconsistent:  a  sensible  collection  of  local 


states  is  called  consistent.  ([CL85]) 

A  global  state  is  defined  to  be  a  consistent  set  H  =  {o\.U2 . <7n}  of  the 

processes’  local  states.  We  assume  that  channel  states  are  captured  in  the 
local  states  of  the  processes.  There  are  many  ways  to  do  this,  for  example 
by  having  each  process  maintain  a  history  of  all  messages  that  it  sends  and 
receives.  In  practice,  one  must  ensure  that  the  representation  of  the  channel 
states  does  not  require  an  unbounded  number  of  messages  to  be  recorded. 


A  consistent  cut  is  defined  to  be  a  set  of  events  C  =  . t\, }  m  Fop 

such  that  the  set  of  states  {<rl ,  02, . . . ,  on }  produced  bv  C  is  a  global  state.  AM 
Thus,  each  consistent  cut  has  a  corresponding  global  state,  and  vice  versa. 

In  this  sense,  global  states  and  consistent  cuts  are  equivalent  notions.  When  .  ». lon_ 


Distribution/ 

lability  CidoaT 
Avail  md/op 
>lat  Spaotni 

l'\.  I 


OO 


defining  properties  of  a  distributed  system,  it  is  convenient  to  refer  to  states; 
our  protocol  uses  events.  For  this  reason,  we  define  both  global  states  and 
consistent  cuts  for  use  in  different  contexts. 

A  property  is  a  predicate  over  the  global  state  of  the  system.  A  stable 
property  is  an  invariant:  once  it  becomes  true,  it  continues  to  be  true1.  The 
most  commonly  studied  examples  of  stable  properties  in  distributed  systems 
are  deadlock  of  a  subset  of  the  processes,  termination  of  a  distributed  com¬ 
putation,  and  the  lack  of  a  token  among  the  processes.  There  are.  of  course, 
other  stable  properties  of  interest.  For  example,  in  a  token-passing  system 
that  can  lose  but  not  generate  tokens,  the  predicate  "there  are  no  more  than 
two  tokens  in  the  system”  is  a  stable  property. 

Let  cr,|$  (read  <r,  relative  to  $)  be  the  values  of  the  subset  of  variables 
of  <r,  that  are  referenced  in  the  formulation  of  property  $.  An  event  e,!  is 
relevant  to  a  property  $  if  ^  that  is,  if  e\  changes  p.'s  local 

state  relative  to  $  by  changing  the  value  of  a  variable  in  the  formulation 
of  $.  For  example,  if  $  is  ,ka  subset  of  the  processes  are  deadlocked”  then 
the  relevant  events  are  those  that  request  a  resource  and  those  that  grant 
a  resource,  since  $  is  formulated  in  terms  of  resource  requests  and  grants. 
Note  that  local  events,  send  events,  and/or  receive  events  can  be  relevant, 
depending  on  how  $  is  formulated. 

2.2  Vector  Clocks 

Our  protocol  is  based  on  a  variant  of  vector  clocks  [Mat89].  In  the  usual 
definition  of  a  vector  clock  V'.  each  event  e,  has  an  n-component  vector 
V'(e,)[l..n]  associated  with  it.  V'(e,)  is  called  the  vector  timestamp  of  et. 
The  components  of  V{e,)  are: 

•  V( e‘ )[i]  =  t\  that  is.  V'(ej)[z]  is  the  number  of  events  that  p,  has 
executed  up  to  and  including  e\. 

‘Some  authors  define  an  invariant  property  to  be  one  that  is  valid  in  all  states  of  the 
svstem. 
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•  V'(e')[j],j  ^  i  is  the  number  of  events  p:  has  executed  that  causally 
precede  e‘. 

As  an  example.  Figure  1  shows  a  space-time  diagram  of  a  two-process  system 
with  the  events  labeled  by  vector  clocks. 


x  :=  1  u  :=  1  x  :=  2 


Figure  1:  Execution  with  vector  clocks. 

A  simple  implementation  of  vector  clocks  has  process  p,  maintain  an 
n-element  vector  V,  of  counters.  Process  p,  increments  lj[i]  whenever  it 
executes  an  event  e,  If  e,  is  a  local  event  or  a  send  event,  then  V(e. )  - 
V',.  If  e,  is  a  send  event,  then  p,  includes  \\  in  the  message.  If  e,  is  the 
corresponding  receive  event,  then  process  p;  sets  Vfc  :  k  j  :  V';[A:j  to  the 
maximum  of  the  previous  value  of  and  the  value  of  V,[k]  in  the  message, 
and  V(e,)  =  Vj. 

The  foUowing  three  relations  hold  between  vector  clocks  and  global 
states,  where  — * •  is  the  happened- before  relation  defined  in  [Lam78].  Equa¬ 
tion  1  defines  the  happened-before  relation  in  terms  of  vector  clocks.  Equa¬ 
tion  2  defines  when  two  events  are  consistent  with  each  other  (we  call  two 
such  events  pairwise  consistent),  and  Equation  3  defines  when  a  set  of  local 

states  {<7i . crn}  produced  by  events  { e i . en}  comprise  a  (consistent ) 

global  state: 

Vi.j:  i  ^  j:  V(e,)[i]  <  V(e;)[ij  =  e.  —  t  1  ■ 


o 


(V'(e,)[i]>  V(ej)[i])A 

(V(ej)[j]  >  =  e,  and  e3  are  pairwise  consistent  (2) 

Vi,j:  V(e,)[t]  >  V(ej)[t]  s  {<71( ..., <7n}  is  a  global  state  (3) 

Equation  2  can  be  derived  by  noting  that  two  events  e\  and  e“  are  incon¬ 
sistent  only  if  (without  loss  of  generality)  e-  — ■  e"  A  3ef  :  e-  -*  ef  —  e“, 
or  in  terms  of  vector  clocks,  V^e*)^]  <  V(e“)(t].  Equation  3  can  be  derived 
from  Equation  2  by  noting  that  all  events  in  a  consistent  cut  are  pairwise 
consistent.  Observe  that  E  =  {<Ti , . . . ,  an)  is  a  global  state  if  and  only  if 
C  =  {ei, . . . ,  en}  is  a  consistent  cut. 

When  vector  clocks  are  used  in  actual  protocols,  not  all  events  cause  a 
process'  vector  clock  to  be  updated.  For  example,  some  causal  broadcast 
protocols  are  based  on  vector  clocks  that  are  updated  only  at  send  or  broad¬ 
cast  events;  other  events  (i.e.,  receive  and  local  events)  do  not  increment  the 
local  component  of  V  [Pet87,BSS90j.  For  our  purposes,  only  the  execution 
of  relevant  events  update  the  local  counter  of  a  vector  clock.  This  is  because 
the  execution  of  a  nonrelevant  event  does  not  change  the  state  of  the  system 
with  respect  to  $. 

If  not  all  send  events  are  relevant  events,  however,  then  Equations  2 
and  3  need  not  hold.  For  example,  suppose  e\  and  e“  are  relevant  events 
that  are  pairwise  inconsistent:  elt  —  ef  A  3e,£'  :  e\  —  ef  —  e“.  If  no  such 
e\'  is  a  relevant  event,  then  V’(e{)[j]  =  V ( e“ )[f]  and  V  (e“)[j]  >  V(e-)[jj. 
which  satisfies  the  left  side  of  Equation  2  even  though  e[  and  e“  are  pairwise 
inconsistent.  On  the  other  hand,  crjj#  =  and  so  the  fact  that  e\  and 

e“  are  pairwise  inconsistent  is  irrelevant  with  respect  to  $.  as  long  as  ef  and 
are  pairwise  consistent. 

We  therefore  define  a  type  of  vector  clock  for  which  a  weaker  version  of 
Equation  2  holds.  Let  the  weak  vector  clock  V$  for  $  be  the  vector  clock 
in  which  V<f(c,)[ij  counts  only  the  events  relevant  to  $  that  p,  has  executed 
through  e,.  Therefore,  the  vector  timestamp  associated  with  several  events 
of  pi  may  have  the  same  value,  but  all  such  events  result  in  the  same  local 


6 


state  relative  to  $.2  For  example,  in  the  case  of  deadlock  the  relevant 
events  are  sending  a  request  for  a  resource,  sending  a  grant  of  a  resource, 
and  receiving  a  grant  of  a  resource  (see  Section  5.2).  If  a  process  p  requests 
a  resource  and  then  sends  an  unrelated  message,  then  the  send  event  does 
not  change  the  local  state  of  p  with  respect  to  possible  deadlock.  So,  the 
send  event  is  given  the  same  weak  vector  timestamp  as  the  resource  request 
event. 

VVe  say  that  two  events  e‘  and  e\  are  equivalent  with  respect  to  4>.  written 
e\  ej  ,  if  e\  and  e\  have  the  same  weak  vector  timestamp.  We  say 
that  two  local  states  er‘  and  erf  are  equivalent  with  respect  to  <f>.  written 
o\  erf,  if  e\  ~<j>  ef .  Similarly,  two  global  states  E  =  {crj1 . of1 }  and 

t*  t* 

S'  =  {erf , . .  .,<rnn}  are  equivalent  with  respect  to  $,  written  S  ~<j>  S',  if  for 
all  i ,  erf  erf.  The  following  versions  of  Equations  1,  2.  and  3  hold  for 
both  vector  clocks  and  weak  vector  clocks: 


Vi.j:  i  j:  V<*(e‘)[t]  <  V*(e“)[i] 

(V*(e')[i]  >  V*(ep[»])  A 
(V*(e;)(;]>  V*(e{)[») 


3e(  .  e“  :  (e[  ~<j,  e[)  A  (e“  ~~<t.  e't 


3  e‘.e“:(e‘  ^  e‘)A(«“  et): 
e\  and  e“  are  pairwise  consistent (5) 
3  global  state  S': 

S'  {crl . ern}  (b) 


Figure  2  shows  weak  vector  clock  values  for  the  execution  shown  in  Fig¬ 
ure  l,  where  we  assume  that  the  predicate  of  interest  references  x  and  y. 
but  not  u  nor  any  of  the  channel  states.  Note  that  although  the  events 
x  1  and  y  3  do  not  form  a  consistent  cut,  their  timestamps  satisfy 
Equation  6  since  there  exist  several  cuts  equivalent  to  this  inconsistent  cut 
(all  necessarily  having  (x  =  1,  y  =  3))  and  they  are  therefore  consistent  with 

2  Note  that  two  events  of  different  processes  may  have  the  same  weak  vector  timestamp 
as  well. 


respect  to  $. 


x  :=  1  u  :=  1  x  :=  2 


Figure  2:  Execution  with  weak  vector  clocks. 


2.3  Locally  Stable  Properties 

Our  protocol  will  detect  a  subset  of  the  stable  properties  that  we  call  locally 
stable  properties.  Informally,  a  stable  property  $  is  locally  stable  if  no  pro¬ 
cess  involved  in  the  property  can  change  its  state  relative  to  $  an  unbounded 
number  of  times  once  $  holds.  For  example,  suppose  $  is  "processes  p,  and 
Pj  are  deadlocked.”  The  property  $  is  locally  stable  because  once  $  be¬ 
comes  true,  neither  pi  nor  p},  the  processes  involved  in  $.  can  execute  any 
event  that  could  affect  $  (e.g.,  requesting  or  granting  a  resource).  Hence. 
<t,|$  and  <7_,|$  remain  constant  once  $  holds. 

More  formally,  let  Q  be  the  set  of  all  global  states  that  the  system  can 
attain.  For  any  E  €  G,  define  E|$  to  be  the  subset  of  E  that  is  referenced 
in  the  formulation  of  $,  and  given  a  set  of  processes  A  define  E  4  to  be 
the  subset  of  E  that  consists  of  the  states  of  the  processes  in  A.  We  will 
call  $  locally  stable  if  it  is  stable  and  if  it  satisfies  the  following  condition: 
consider  any  E  6  Q  that  satisfies  $,  and  let  A  be  the  set  of  processes  p,  such 
that  <7,|$  does  not  change  an  unbounded  number  of  times  in  any  execution 
starting  at  E.  Then,  for  all  E'  €  Q  such  that  E'4j$  =  E4 1 $ .  $  holds  in  E'. 


In  other  words.  $  can  be  determined  from  only  the  states  of  the  processes 
in  A.  Note  that  /I  can  be  empty,  but  only  for  trivial  stable  properties;  if  A 
is  empty,  then  $  can  be  determined  without  knowledge  of  the  state  of  any 
process  or  channel  and  must  therefore  be  valid  or  not  valid  in  all  states.  For 
this  reason,  we  will  assume  in  this  paper  that  A  is  nonempty. 

The  most  commonly  studied  stable  properties — deadlock,  termination, 
and  lack  of  a  token — are  all  locally  stable.  For  example,  if  T  is  a  deadlock 
state,  then  .4  includes  the  deadlocked  processes,  and  so  the  presence  of 
deadlock  can  be  determined  by  considering  only  the  states  of  the  processes 
in  A.  An  example  of  a  stable  property  that  is  not  locally  stable  is  the 
property  “there  are  no  more  than  k  :  k  >  0  tokens"  in  a  system  where 
tokens  cannot  be  created  but  can  be  lost  when  passed.  This  is  because  if 
S  is  a  state  in  which  there  are  k  tokens,  then  every  process  can  execute  a 
relevant  event  an  unbounded  number  of  times  (namely,  it  can  pass  tokens), 
thereby  changing  its  local  state  relative  to  $  an  unbounded  number  of  times, 
and  so  .4  is  empty.  The  property,  however,  is  not  valid  in  all  states  of  the 
token  passing  system.3 

For  most  locally  stable  properties  of  interest,  the  processes  in  .4  cannot 
change  their  local  states  relative  to  $  at  all  once  $  holds  ( l.e. .  the  bound 
on  the  number  of  future  relevant  events  that  they  can  execute  is  zero).  In 
this  case,  if  our  protocol  presented  below  is  initiated  in  a  state  in  which  $ 
holds,  it  is  guaranteed  to  detect  <$.  For  locally  stable  properties  for  which 
the  bound  is  not  zero,  however,  the  protocol  may  not  detect  $  if  initiated 
in  a  state  in  which  $  holds.  However,  the  system  will  eventually  reach  a 
state  in  which  all  processes  in  .4  will  execute  no  further  relevant  events.  If 
the  protocol  is  initiated  in  such  a  state,  it  is  guaranteed  to  detect  <h.  Thus. 

3 .4  need  not  be  empty  for  a  stable  property  to  be  not  locally  stable  For  example, 
suppose  $  is  again  "there  are  no  more  than  k  .  k  >  0  tokens"  and  the  token  passing 
system  consists  of  red  and  green  processors.  Furthermore,  only  red  processors  can  lose 
tokens  and  a  green  processor  never  passes  a  token  to  a  red  processor.  In  this  system,  ,-t 
is  the  set  of  green  Processors  (green  processors  never  execute  a  relevant  event),  yet  the 
validity  of  $  depe  .ds  on  the  states  of  both  the  green  and  red  processors  Hence  is  not 
locally  stable. 

I) 


though  the  protocol  eventually  detects  $  in  all  cases,  it  detects  $  “sooner" 
when  the  bound  is  zero. 

3  Protocol 

3.1  Basic  Protocol 

We  first  assume  that  a  process  po  will  determine  whether  the  global  state  of 

the  processes  P  =  {p, . pn}  satisfies  a  locally  stable  property  <P.  Later. 

we  will  change  this  protocol  so  that  any  number  of  processes  in  P  may 
concurrently  assume  the  role  of  po- 

Our  protocol  is  based  on  the  notion  of  a  consistent  subcut — a  set  of 
events  whose  timestamps  satisfy  Equation  6.  (The  state  of  a  single  process 
is  trivially  a  consistent  subcut.)  Informally,  the  protocol  works  as  follows. 
Whenever  a  process  p,  executes  a  relevant  event  e,,  p,  records  in  a  buffer  5, 
its  local  state  relative  to  $  and  the  vector  time  stamp  V$(et)  as  Bt.cr  and 
B,.V,  respectively.  Process  po  periodically  collects  the  values  of  the  buffers 

in  any  order,  yielding  a  set  B  =  {B^.B? . Bn).  Once  po  has  constructed 

this  set,  po  determines  if  there  exists  a  maximal  consistent  subcut  of  B  such 
that  the  states  associated  with  the  timestamps  in  the  subcut  satisfy  <I>.  If 
Po  can  find  such  a  subcut.  then  $  must  currently  hold.  Note  that  po  need 
not  examine  all  consistent  subcuts;  if  /4'  C  A  and  $  holds  in  £$|,4',  then 
$  will  also  hold  in  E$|T,  so  we  need  examine  only  the  maximal  consistent 
subcuts  of  B.  Of  course,  $  may  be  of  the  form  Vp,  :  4 r ( p, ) .  in  which  case 
only  a  full  consistent  cut  will  satisfy 

Unfortunately,  the  number  of  maximal  subcuts  of  a  set  of  n  weak  vector 
clocks  is  fi(2n).  Fortunately,  it  is  not  necessary  for  po  to  examine  all  of 
these  subcuts.  Suppose  the  set  of  buffer  values  contains  B ,  and  B.  that  are 
inconsistent:  <  Bj.V[i\.  These  two  states  violate  Equation  6.  and 

so  both  cannot  be  part  of  the  same  consistent  subcut.  However,  5,.V’[t]  < 
Bj.V[i]  implies  that  p,  executed  a  relevant  event  between  the  time  that 
B,.c 7  was  recorded  and  the  time  that  B}.o  was  recorded.  Therefore,  po  need 
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•  Each  process  p,  £  P  records  a-t  and  V*(e,)  in  buffer  5,  upon  executing 
a  relevant  event  e,. 

•  Periodically,  po  collects  all  of  the  buffers  5,  and  extracts  from  them 
the  latest  subcut  {at  :  Vj  :  Bj.V[i\  <  5,.Vr[i]}. 

•  po  detects  $  if  $  holds  on  the  latest  subcut. 


Figure  3:  Basic  Protocol 

not  consider  subcuts  containing  B,.a:  if  the  system  is  in  a  state  such  that 
the  processes  involved  in  $  will  execute  no  more  relevant  events,  then  B,.a 
cannot  be  necessary  for  the  detection  of  $  and  so  need  not  be  considered. 
Otherwise,  the  system  will  eventually  reach  such  a  state.  If  Bt.cr  is  involved 
in  determining  then  B ,  will  be  recorded  such  that  B,.  VM  >  B,.V [,] 
for  all  j.  Thus,  given  a  set  of  buffered  values  B  and  the  partial  order 
iBt.  Bj  £  B  :  B,  >  Bj  =f  B,.l'{j]  >  B,.l'[j}.  po  need  only  find  the  greatest 
elements  of  15  with  respect  >.  which  can  be  done  in  filn'l  time.”1  We  call 
this  subcut  the  latest  subcut  of  B.  The  latest  subcut  is  clearly  a  maximal 
subcut,  since  all  states  that  are  not  part  of  the  latest  subcut  are  inconsistent 
with  some  state  in  the  latest  subcut.  This  gives  us  the  protocol  shown  in 
Figure  3. 

The  soundness  of  this  protocol  is  straightforward.  We  now  argue  that  the 
protocol  is  complete  as  well;  that  is,  if  $  holds,  then  our  protocol  will  detect 
4>.  Let  E  be  the  first  global  state  in  which  $  holds.  Since  $  is  locally  stable, 
there  is  a  nonempty  set  of  processes  A  each  of  which  executes  a  bounded 
number  of  relevant  events  after  E;  these  processes  will  not  change  their  states 
relative  to  $  nor  update  their  vector  clocks  an  unbounded  number  of  times 

Whe  greatest  elements  of  >-  can  be  found  by  discarding  any  values  B,  such  that 
3«  :  B, .  V\j]  >  B}. V\j],  which  can  be  done  in  0[  n2 )  time  using  a  straightforward  algorithm 
And,  if  all  values  are  incomparable  then  all  the  values  are  greatest  elements  of  To 
determine  that  they  are  all  incomparable  takes  nJ  comparisons,  and  so  the  problem  is 
Q(n2). 
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in  any  run  starting  at  E.  Suppose  that  po  initiates  the  protocol  in  or  after 
E  (i.e.,  when  $  holds).  Because  the  processes  in  A  can  each  execute  only 
a  bounded  number  of  relevant  events  after  £,  and  because  message  delivery 
time  is  finite,  there  is  some  global  state  E'  reachable  in  finite  time  from  E 
after  which  the  processes  in  A  execute  no  more  relevant  events.  Therefore  p0 
will  eventually  collect  the  states  E$|j4.  From  the  definition  of  >,  the  state 
of  a  process  p,  in  A  must  be  in  any  latest  subcut  constructed  by  po  because 
Pi  will  execute  no  more  relevant  events.  Since  $  is  stable,  E'  satisfies  $,  and 
since  $  is  locally  stable,  $  can  be  detected  by  examining  E'4j<^.  Hence,  po 
will  detect  $. 

3.2  Decentralization 

In  the  above  protocol,  po’s  role  is  to  collect  the  local  states,  determine  the 
latest  subcut,  and  check  if  $  holds  in  this  subcut.  We  can  decentralize  these 
steps  by  collecting  the  local  states  in  a  token. 

Consider  a  token  K  that  consists  of  n  entries  (D\ . Dn)  where  each 

entry  D ,  =  ( B,.a.  B,.V[i}):  that  is.  Dx  will  hold  the  state  of  p,  relevant  to  <1> 
and  the  local  component  of  p,’s  vector  clock  when  it  generated  Bt.cr.  Assume 
that  there  exists  a  special  value  1  for  D ,  indicating  that  the  state  is  not  in 
the  token;  all  of  the  D ,  in  K  are  initially  set  to  X. 

To  determine  whether  $  holds,  a  process  generates  a  token  A',  inserts 
its  state  and  vector  clock  value  into  K,  and  passes  the  token  to  any  other 
process.  When  a  process  p:  receives  a  token  K.  it  takes  the  following  steps: 

1.  Set  Dj  to  (B}.a,  Bj.V[j)). 

2.  For  all  non-1  values  of  Dk  that  are  not  in  the  latest  subcut,  set  Dk  to 

1. 

3.  Determine  whether  the  state  values  in  K  satisfy  $.  If  so.  then  the 
detection  is  made;  otherwise,  p}  forwards  the  token  to  a  process  p^. 
chosen  fairly,  that  has  Dk  =  X.  If  there  is  no  such  process,  then  p. 
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can  either  drop  the  token  or,  when  p:  computes  a  new  value  of  Bj,  p} 
can  restart  at  Step  1  with  this  token. 


Note  that  when  process  pj  executes  rule  2,  Bj  must  be  part  of  the  latest 
subcut;  if  it  were  not,  then  there  would  exist  a  recorded  value  of  B(  in  D( 
such  that  Bj.V[j]  <  B(.V\j j.  This  implies  that  p(  knows  of  a  relevant  event 
executed  by  p:  that  results  in  a  state  causally  after  the  state  recorded  in  B3. 
which  violates  the  definition  of  Bj.  Thus,  only  the  earlier  values  D k  need  be 
tested  with  respect  to  B}.  From  above,  the  value  Bit  in  Dk  can  be  discarded 
if  Bk-V[k]  <  Bj.V[k\.  The  value  Bk-V[k\  is  stored  in  Dk-V,  so  K  carries 
enough  information  for  pj  to  make  this  test. 

The  resulting  protocol  is  summarized  in  Figure  4.  Note  that  we  have 
no  a  priori  restriction  on  how  many  tokens  there  can  be  in  the  system  at 
any  time  or  on  the  order  in  which  the  token  is  passed,  other  than  that  it  is 
passed  in  a  fair  manner.  These  decisions  can  be  made  when  the  protocol  is 
applied  to  a  particular  problem. 

If  this  protocol  is  initiated  in  a  state  in  which  $  holds  and  after  which 
no  process  executes  a  relevant  event,  then  $  will  be  detected  with  no  more 
than  n  token  passes.  However,  if  processes  do  execute  relevant  events  after 
the  protocol  is  initiated,  then  the  initial  detection  may  not  be  successful  and 
the  protocol  must  be  restarted.  If  the  number  of  relevant  events  that  can  be 
executed  after  <£  holds  is  bounded  by  A,  then  detection  can  take  up  to  an 
additional  An  token  passes.  For  large  A,  our  protocol  could  perform  worse 
than  a  snapshot  protocol.  In  practice,  however,  we  do  not  expect  A  to  be 
large. 

4  Termination  Detection 

We  now  instantiate  the  general  protocol  given  above  to  obtain  a  protocol 
that  detects  termination  in  a  distributed  system.  There  are  many  vari¬ 
ations  of  this  property;  the  earliest  that  we  know  of  is  due  to  Dijkstra 
and  Scholten  [DS80].  The  following  definition  is  the  same  as  that  given 
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•  Each  process  p,  £  P  records  cr,  and  V$(et)  in  buffer  B,  upon  executing 
a  relevant  event  et  . 

•  When  pi  wants  to  detect  p,  generates  a  token  { D\  :=  1 Dn 
X),  sets  Z?t  to  (Bi.cr,  Bi.V[i]),  and  forwards  the  token  to  any  other 
process. 

When  pj  receives  a  token: 

•  Pj  sets  Dj  to  (Bj.ct.  ByV[j}). 

•  For  each  Dk  :  k  ^  j,Dk  #  X,  p;  sets  Dk  ■=  X  if  Dk-V  -  Bk.V[k\  < 
BrV{k\. 

•  pj  determines  if  $  holds  on  the  state  values  in  the  token.  If  not,  p: 
forwards  token  to  any  pk  such  that  Dk  =  X. 


Figure  4:  Decentralized  Protocol 


in  [Mis83]. 

All  processes  are  either  active  or  idle.  Only  active  processes  can  send 
messages.  An  active  process  may  become  idle  at  any  time,  and  an  idle 
process  can  become  active  only  upon  receipt  of  a  message.  The  system 
is  terminated  when  all  processes  in  the  system  are  idle  and  there  are  no 
messages  in  transit. 

The  local  state  of  a  process  relative  to  termination  consists  of  whether 
the  process  is  active  or  idle  and  whether  there  is  a  message  on  an  incoming 
channel.  Therefore,  the  events  that  are  relevant  to  termination  are  sending 
a  message,  receiving  a  message,  becoming  idle,  and  becoming  active.  Each 
process  will  update  its  (weak)  vector  clock  upon  executing  any  of  these 
events.  Note  that  for  this  problem,  we  do  not  need  to  keep  track  of  the 
contents  of  the  messages  exchanged  between  processes;  only  the  number  of 
messages  is  important.  To  capture  the  channel  states,  we  have  each  process 
keep  track  of  how  many  messages  it  has  sent  and  received  on  each  adjacent 
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channel.  The  combined  information  of  all  of  the  processes  will  then  yield 
the  number  of  messages  in  transit  on  each  channel:  if  p,  has  sent  more 
messages  to  pj  than  Pj  has  received  from  p,,  then  there  is  at  least  one 
message  on  channel  C,i}.  In  this  way,  we  can  represent  the  relevant  channel 
states  without  recording  an  unbounded  number  of  messages. 

We  instantiate  the  general  protocol  given  in  Section  3.2  as  follows. 

Each  process  p,  maintains  the  following  local  state  variables: 

•  active,:  Boolean  =  true  if  and  only  if  p,  is  active. 

•  send,[l..n]:  Integer  array,  send^fj]  =  the  number  of  messages  that  p, 
has  sent  to  p:.  All  are  initially  0. 

•  recv,[l..n]:  Integer  array,  rect^j]  =  the  number  of  messages  that  p, 
has  received  from  py  All  are  initially  0. 

When  p,  sends  a  message  to  p;.  send,{j)  is  incremented.  When  p,  receives 
a  message  from  p;.  recr,[j]  is  incremented.  When  p,  becomes  active  or  idle. 
active ,  is  set  appropriately. 

At  some  point,  an  idle  process  will  start  the  detection  protocol  by  cir¬ 
culating  a  token  as  described  in  Section  3.2.  The  termination  condition  can 
only  be  evaluated  over  a  total  global  state  (as  opposed  to  a  consistent  proper 
subset  of  the  process  states),  so  a  positive  determination  can  be  made  only 
by  the  process  pj  that  is  the  last  to  add  its  state  to  the  token. 

Process  p/  detects  termination  if  and  only  if  the  following  three  condi¬ 
tions  hold: 

1.  The  timestamps  in  the  token  form  a  consistent  cut: 

2.  All  processes  are  idle:  Vi  :  active ,  =  false: 

3.  There  are  no  messages  in  transit:  'ii.j  :  send,\j]  =  rector]. 

The  following  theorem  and  corollary  show  that  item  1  is  redundant.  The 
theorem  assumes  for  simplicity  that  the  buffered  states  are  not  collected  in 
a  token:  the  corollary  removes  this  assumption. 
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Theorem  1  Let  B  =  {5, :  i  =  1,2 - n}  be  a  a  set  of  buffered  state  values 

that  were  recorded  in  a  system  that  does  not  collect  states  in  a  token  D.  If  in 
B,Vi,j:  sendi[j]  =  recvjfz],  then  the  global  state  defined  by  B  is  consistent: 
Vt,j:  Bi.V{ «]  >  B,.V[i\. 

Proof :  Suppose  by  way  of  contradiction  that  item  3  holds  over  B  but  the 
timestamps  in  B  form  an  inconsistent  cut:  3 iff  :  B{.V[i]  <  B:.V[i}.  Bj.V\i\ 
is  advanced  only  when  p}  receives  a  message,  and  events  local  to  p.  affect 
only  B:.V[j].  Therefore,  in  order  for  BrV[i\  to  advance  beyond  the  recorded 
B,.V[i\,  there  must  have  been  a  chain  of  messages  between  pi  and  between 
the  time  that  B,  was  collected  and  the  time  that  B}  was  collected.  This 
implies  that  there  is  some  k  such  that  the  recorded  send,[k ]  <  rect^fij, 
contradicting  the  assumption  that  item  3  holds. 

□ 

Corollary  2  Let  B  =  {B,:i  =  1.2,  ...n}  be  a  a  set  of  buffered  state  values 
that  were  collected  in  a  token  D.  If  in  D.'ii.j:  send,[j]  =  recvjj'j.  then  the 
global  state  defined  by  D  is  consistent:  ii.j :  . 

Proof:  None  of  the  events  executed  in  collecting  the  buffered  states  into  a 
token  are  relevant.  Hence,  collecting  the  states  in  this  way  has  no  effect 
on  their  consistency  with  respect  to  termination.  The  buffered  states  will 
therefore  be  consistent  with  respect  to  termination  when  items  2  and  3  hold. 

□ 

Corollary  2  implies  that  the  vector  clocks  need  not  be  maintained.  Fur¬ 
thermore,  these  checks  can  be  done  incrementally.  For  example,  we  can 
assign  a  total  order  to  the  processes  and  have  the  token  passed  along  that 
total  order.  When  process  pi;  receives  the  token,  it  tests  to  see  if 

-'activek  A  (Vf :  1  <  I  <  k  :  (sen<4[/]  =  recri[fc])  A  (send([A:]  =  rect’fc (/])). 

If  this  condition  does  not  hold,  then  pi,  can  drop  the  token.  If  the 
condition  holds  and  k  =  n,  then  termination  is  detected:  otherwise,  pk  fills 
in  Dk  and  passes  the  token  to  pi<+\- 
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This  yields  the  protocol  given  in  [Mat87]  as  the  channel  counting  pro¬ 
tocol,  which  requires  only  n  messages  to  detect  termination  once  it  holds, 
and  which  can  be  further  refined  into  a  protocol  that  is  space-efficient.  This 
is  a  good  example  of  how  our  general  protocol,  which  constructs  consistent 
(sub)cuts  explicitly,  can  be  used  to  derive  a  much  simpler  protocol  that 
constructs  consistent  cuts  implicitly. 

5  Deadlock  Detection 
5.1  fc-out-of-m  Deadlock 

We  now  instantiate  the  general  protocol  given  in  Section  3  to  obtain  a  proto¬ 
col  that  detects  k-out-of-m  deadlock  in  a  distributed  system.  This  problem 
was  first  formulated  and  solved  in  [BT84J.  In  this  formulation,  a  process 
can  request  k  resources  from  a  pool  of  m  resources. 

A  process  is  either  active  or  blocked,  where  an  active  process  is  one 
that  is  not  waiting  for  any  other  process.  Active  processes  may  issue  k- 
out-of -m  requests  in  the  following  way.  When  an  active  process  p.  requires 
k  processes  to  carry  out  some  request,  it  sends  request  messages  to  each 
of  the  m  processes  that  can  perform  this  action.  Process  p,  then  becomes 
blocked,  and  waits  until  the  action  requested  is  carried  out  by  at  least  k  of 
the  m  processes.  A  process  can  not  send  any  further  requests  while  blocked, 
but  a  process  can  receive  request  messages  while  blocked. 

Only  active  processes  can  carry  out  a  requested  action.  If  a  process  p, 
receives  a  request  while  active,  it  will  either  become  blocked  or  carry  out 
Pi's  requested  action  within  finite  time.  In  the  latter  case.  p.  will  send  a 
grant  message  to  p,.  When  p,  receives  k  grant  messages,  it  becomes  active 
again.  It  then  relinquishes  the  requests  made  to  the  rest  of  the  processes  to 
which  it  sent  request  messages  by  sending  them  relinquish  messages.  We 
assume  that  the  recipient  of  a  relinquish  message  acknowledges  the  message 
and  that  the  sender  of  a  relinquish  message  waits  for  all  acknowledgements 
before  sending  another  request  message.  By  doing  so.  we  guarantee  that 
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Pi  can  discard  any  grant  messages  received  after  the  first  k  are  received. 

The  state  of  a  process  p,  relative  to  fc-out-of-m  deadlock  consists  of 
the  number  of  grants  needed  for  p,  to  become  active  and  the  current  set 
of  processes  that  p,  is  waiting  for.  We  capture  this  state  by  having  each 
process  keep  track  of  the  processes  on  which  it  is  blocked  and  the  number 
of  grant  messages  that  it  has  sent  and  received  on  each  adjacent  channel. 

We  instantiate  the  general  protocol  given  in  Section  3.2  as  follows.  Each 
process  p,  maintains  the  following  local  state  variables: 

•  kt:  Integer  =  the  number  of  grant  messages  required  for  p,  to  become 
active  (initially  0). 

•  <7_send,[l..n]:  Integer  array,  g.send^j]  is  the  number  of  grant  mes¬ 
sages  that  pi  has  sent  to  p:  (all  are  initially  0). 

•  <7-recr,[l..n]:  Integer  array.  p.recy,[j]  is  the  number  of  grant  messages 
that  p,  has  received  from  p;  (all  are  initially  0). 

•  uf, :  Integer  set.  These  are  the  processes  that  p,  is  waiting  for.  When 
p,  sends  a  request  message  to  p;.  wf,  :=  wf,  :J  {;}:  when  p,  receives 
a  grant  message  from  p;  or  sends  a  relinquish  message  to  p, ,  wf, 

vf,  ~  01- 

Deadlock  is  determined  by  constructing  and  reducing  the  system  waits- 
for  graph.  This  graph  is  constructed  as  follows: 

•  a  waits-for  edge  is  drawn  from  p,  to  p;  if  wf  3  j  A  (p.sendjft]  = 
p.rectv(j])).  That  is.  p,  is  waiting  for  a  resource  from  p;  and  no  grant 
message  is  in  transit  from  p;  to  pt. 

•  the  number  of  grants  k,  needed  for  p,  to  be  unblocked  is  k,  -  | V;  : 
j-send^i]  -  p.rect\(j]|.  That  is.  k,  is  the  number  of  grants  that  p,  is 
waiting  for  less  the  number  of  grants  in  transit  to  p,. 

Deadlock  is  tested  by  reducing  this  graph  as  follows:  if  an  edge  points  from 
p,  to  p;  and  pj  is  active,  then  the  edge  can  be  erased  and  k,  can  be  reduced 


by  one;  and  if  a  process  has  k,  =  0,  then  all  of  its  outgoing  edges  can  be 
erased.  The  system  is  deadlocked  if  and  only  if  there  are  edges  that  cannot 
be  removed  by  following  these  two  rules. 

In  this  system,  the  relevant  events  of  p;  are  those  that  change  wfx ,  it,, 
g.sendi  and  p.recv,.  Hence,  the  relevant  events  are  requesting  a  resource, 
sending  a  grant  message,  receiving  a  grant  message  and  sending  a  re¬ 
linquish  message.  We  can  now  argue  that  fc-out-of-m  deadlock  is  locally 
stable:  a  deadlocked  process  can  execute  only  a  bounded  number  of  relevant 
events  (namely,  it  can  receive  up  to  k,  -  1  grant  messages),  and  any  valid 
global  state  that  contains  the  local  states  of  the  deadlocked  processes  still 
yields  an  irreducible  waits-for  graph. 

The  deadlock  detection  protocol  is  as  follows.  When  a  process  p,  wishes 
to  test  for  deadlock,  p,  generates  a  token,  fills  Dx  with  (( wfx,  g.senc ,  g.reci\ , 
ki),  5,.V'[t]),  and  forwards  the  token  to  some  p;  ^  pt.  Upon  receiving  a  to¬ 
ken.  a  process  pj  sets  D}  to  {{wfy  gsend,,  g.recv:,  k}),  BrY[j})  and  discards 
all  values  D *  that  are  inconsistent  with  B}  by  setting  to  p,  then 
checks  to  see  if  deadlock  holds  on  the  remaining  values  by  constructing  the 
waits-for  graph  and  reducing  it.  If  deadlock  does  not  hold,  then  p;  forwards 
the  token  to  any  process  pt  such  that  Dk  =  1. 

We  can  improve  this  protocol  by  choosing  the  process  to  which  the  token 
is  passed  more  carefully.  Since  we  would  like  to  detect  deadlock  as  quickly 
as  possible,  the  forwarding  process  should  choose  a  process  that  is  likely  to 
add  information  leading  to  the  detection  of  a  deadlock.  A  reasonable  choice 
is  a  process  p}  such  that  D}  =  L  and  such  that  p;  is  in  wft  for  some  D ,  *  _l. 

The  full  protocol  is  presented  in  Figure  5.  We  assume  that  the  process 
p,  that  generates  the  token  does  so  because  it  suspects  that  it  is  involved  in 
a  deadlock;  that  is,  wfx  is  not  empty. 

5.2  RPC  Deadlock  Detection 

1-out-of-l  deadlock  is  a  special  case  of  k-out-of-m  deadlock  that  lends  itself 
to  further  optimization.  This  type  of  deadlock  is  called  RPC  deadlock  be- 


when  pi  receives  token  (D i, . . . ,  Dn): 

begin 

Dt.a  :=  ki,g.sendi,g.recvi,  wfr; 

Di.V  := 

for  all  D}  :  DrV  <  Bi.V[j\ :  D :  :=  1; 
if  there  exists  pj  :  (D:  =  1)  A  (3p,-  :  j  €  toff) 
then  forward  token  to  one  of  these  p; 

else  begin 

construct  waits-for  graph; 
reduce  waits-for  graph; 

if  graph  is  not  fully  reduced  then  signal  deadlock 
else  drop  the  token 
end 
end 


Figure  5:  Protocol  tor  Detecting  fc-out-of-m  Deadlock 


cause  it  can  occur  in  a  remote  procedure  call  system,  where  making  a  remote 
procedure  call  is  analogous  to  requesting  a  resource  from  a  single  processor. 
The  waits-for  graph  is  constructed  as  for  fc-out-of-m  deadlock,  except  that 
k{  =  \wfi\  and  thus  need  not  be  represented  in  the  wait-for  graph.  Fur¬ 
thermore,  relinquish  messages  are  not  needed  and  the  waits-for  graph  is 
reducible  if  and  only  if  it  does  not  contain  a  cycle. 

We  can  instantiate  our  protocol  for  detecting  RPC  deadlock  as  follows. 
As  before,  the  relevant  events  are  requesting  a  resource  (here,  making  an 
RPC  request),  sending  a  grant  message  (here,  sending  the  reply  to  the  RPC 
request),  and  receiving  a  grant  message.  Any  blocked  process  p,  can  decide 
to  detect  deadlock  by  generating  an  empty  token,  inserting  its  buffered  state 
into  £),,  and  passing  the  token  to  the  (single)  process  in  wfx.  When  p;  receives 
a  token  from  pt,  p;  will  verify  that  p,  waits-for  p}  and  will  pass  the  token 
on  to  the  process  blocking  p}.  A  process  detects  deadlock  when  it  receives 
a  token  that  contains  a  complete  cycle.  The  resulting  protocol  is  shown  in 
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RPCDeadlock(pJ ):  cobegin 

do  forever  when  (wf3  ^  0)  and  (waited  too  long) 
create  empty  token  K ; 

K.Dj  :=  (gsendi[wfj\,  BrV[j] ); 
pass  K  to  wf} 

||  do  forever  when  receive  token  K  from  p, 
if  K.Dj  =  1  then 

if  wfJ  0  and  K  .Dx.g.sendi[j)  =  g-recv^i] 
and  VA\£>fc:  K.Dk  £  0:  K.Dk.V[k]  >  B:.V[k } 

then 

K. D,  :=  ( g.send, [ u»/;  j ,  B} .  V[j ] ) : 
pass  K  to  wfj 

else  skip  /*  drop  token  K  */ 

else 

if  K . D wg ,sendk[j]  =  g.recv}[i]  then  detect  deadlock 

else  skip  /*  drop  token  K  */ 

coend 


Figure  6:  RPC  Deadlock  Protocol.  Original 
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Figure  6.  Note  that  if  the  waits-for  graph  contains  a  d-cycle,  then  the  token 
need  be  passed  only  d  times. 

This  protocol  can  be  further  simplified  by  applying  the  following  two 
theorems. 

Theorem  3  If  {K.D,. g  . send,[j ]  =  £_rec»j[t])  then  p,  has  executed  no  rele¬ 
vant  event  since  setting  K.D,. 

Proof  :  Assume  ( K  .D,.g.send,[j]  =  g.recvj[i'\).  The  last  relevant  event  that 
Pi  can  have  executed  before  setting  K.D ,  was  to  send  a  request  message  to 
pj.  The  first  relevant  event  that  p<  can  have  executed  after  setting  K.D,  is 
the  receipt  of  a  grant  message  from  pj.  Since  [K  .D,.gsend,[j]  =  g„recv,{i}). 
Pj  has  sent  no  grant  messages  to  p;  since  p,  sent  the  request  to  p,.  Hence, 
p,  can  have  executed  no  relevant  event  since  setting  K.D,. 

Theorem  4  If  (K .D,.gsend,[j]  =  p.recv,[i])  then  no  process  that  has  set 
its  value  m  K  has  subsequently  executed  a  relevant  event. 

Proof  :  Let  l  be  the  number  of  values  Dk  :  Die  *  J_,  and  assume  that 
( K .D,.g.send,[j]  =  p.neciyfi]).  We  will  use  induction  on  i. 

Base  case  {£  =  l).  Follows  directly  from  Theorem  3. 

Induction  case  (I  >  1).  By  the  induction  hypothesis,  no  process  prior  to 
p,  had  executed  a  relevant  event  when  p,  received  K.  So  process  prior  to 
Pi  can  execute  a  relevant  event  until  p,  does,  and  by  Theorem  3  p,  has  not 
executed  a  relevant  event  since  forwarding  K  to  p, . 

CD 

Theorem  4  implies 

( g.send,[j |  =  p.rect’j[i])  =>  (VA: :  Dk  ^  0:  Dk.V  =  >  Br\'[k}). 

Thus,  the  vector  clocks  can  be  omitted  and  the  token  need  only  carry  the 
identity  of  the  process  that  initiated  the  test  for  deadlock.  The  resulting 
protocol,  shown  in  Figure  7.  first  appeared  in  [CMH83]  specialized  for  m  =  1. 


RPCDeadlock(pj):  cobegin 

do  forever  when  ( wf]  ^  0)  and  (waited  too  long) 
send  (gsendj[wf:},j)  to  wf} 

||  do  forever  when  receive  (s,k)  from  p, 
if  k  j  then 

if  wfj  ^  0  and  5  =  g.recv:[i ] 
then  send  (g.send^wf^.k)  to  wfJ 

else 

if  s  =  g.recV][i]  then  detect  deadlock 

coend 

Figure  7:  RPC  Deadlock  Protocol.  Refined 

The  protocol  in  Figure  6  can  be  easily  generalized  to  detect  detect  and- 
deadlock  (m-out-of-m  requests),  since  a  cycle  in  the  waits-for  graph  is  equiv¬ 
alent  to  deadlock  in  this  case  as  well.  The  only  change  necessary  is  that  when 
Pj  passes  the  token,  it  must  replicate  the  token  and  pass  a  copy  to  each  pro¬ 
cess  in  wfy  With  and-deadlock.  however,  a  process  can  execute  a  relevant 
event  while  deadlocked — a  deadlocked  process  can  receive  a  proper  subset 
of  the  required  grant  messages.  Thus,  if  the  waits-for  graph  contains  a  d- 
cycle,  then  even  if  tokens  are  generated  by  a  deadlocked  process  and  passed 
along  cycles,  such  tokens  may  be  dropped  up  to  n-d  times  before  the  dead¬ 
lock  is  detected.  However,  the  protocol  in  Figure  7  can  be  effectively  run 
in  parallel  by  having  p;  send  (g.sendj [uj,  k)  to  all  the  processes  u  €  wf;  in 
which  case  a  token  passed  along  a  cycle  will  not  be  dropped.  The  resulting 
deadlock  detection  protocol  is  the  one  presented  in  [CMH83]. 

6  Conclusion 

This  paper  defined  a  proper  subclass  of  the  stable  properties  which  we  denote 
the  locally  stable  properties.  This  subclass  is  interesting  in  that  a  process 
that  is  '‘involved”  in  establishing  the  stable  property  is  limited  in  what  it 
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can  do,  and  will  eventually  cease  changing  its  local  state  with  respect  to 
the  stable  property.  Hence,  in  order  to  detect  a  locally  stable  property,  a 
consistent  cut  need  not  be  explicitly  constructed — the  relevant  local  states 
will  form  a  consistent  subcut  implicitly.  This  leaves  only  the  problem  of 
detection. 

In  order  to  make  this  observation,  we  needed  to  define  consistent  cuts 
with  respect  to  a  global  state  predicate,  and  slightly  extend  the  notion  of 
vector  clocks  to  accommodate  our  definition.  We  then  gave  a  simple  and 
decentralized  protocol  that  detects  when  a  locally  stable  property  occurs  in 
a  distributed  system.  The  protocol  can  be  easily  refined,  which  we  illustrate 
by  refining  it  to  a  known  protocol  for  termination  detection,  a  new  proto¬ 
col  for  /r-out-of-m  deadlock  detection,  and  known  protocols  for  m-out-of-m 
deadlock  detection. 

In  the  reductions  to  the  two  known  protocols,  the  vector  clocks  proved 
redundant.  This  was  because  the  processes  involved  in  $  could  execute 
no  relevant  events  once  they  established  the  condition  of  interest,  and  the 
detection  algorithm  also  ensmed  that  the  channels  carried  no  undelivered 
relevant  messages.  In  both  cases,  the  receipt  of  a  relevant  message  was  the 
only  relevant  event  that  a  process  involved  in  $  could  execute,  and  so  an 
empty  channel  between  two  processes  implied  pairwise  consistency  of  the 
recorded  states  of  those  two  processes.  This  observation  is  similar  to  one  of 
the  steps  in  the  refinement  of  a  termination  protocol  given  in  [CM86],  yet 
we  have  not  been  able  to  refine  our  protocol  to  their  termination  protocol. 

The  class  of  locally  stable  properties  was  defined  in  proving  the  protocol 
correct.  We  would  like  to  determine  what  kinds  of  properties  are  locally 
stable.  We  know  of  two  general  classes:  the  locally  stable  properties  of 
distributed  garbage  detection,  termination,  and  global  deadlock  can  all  be 
expressed  as  detecting  no  token  in  a  generalized  token  passing  system,  yet 
deadlock  of  a  subset  of  the  processes  does  not  seem  to  be  so  expressible.  We 
are  interested  in  whether  there  are  other  classes  of  locally  stable  properties. 

Not  all  interesting  stable  properties  are  locally  stable,  however.  For 


example,  the  property  “the  number  of  tokens  is  less  than  k  >  0"  in  a  token 
passing  system  that  can  lose  but  not  regenerate  tokens  is  stable  but  not 
locally  stable.  We  do  not  know  if  there  are  protocols  that  are  more  message- 
efficient  than  snapshot  protocols  for  detecting  such  properties. 

Our  work  was  motivated  by  trying  to  derive  message-efficient  special- 
purpose  detection  protocols  from  a  general  detection  protocol.  We  have  only 
been  partially  successful.  Our  protocol  is  most  efficient  when  no  process  can 
execute  a  relevant  event  after  the  condition  of  interest  holds.  Furthermore,  in 
our  derivation  of  the  m-out-of-m  deadlock  detection  protocol  in  Section  5.2. 
our  protocol  could  generate  0(n)  extra  messages.  Hence,  we  would  like  to 
better  understand  the  notion  of  relevant  events  and  weak  vector  clocks. 
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